Latent Tree Models for Hierarchical Topic Detection

Abstract

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. Unlike LDA-based topic models, HLTMs do not refer to a document generation process and use word variables instead of token variables. They use a tree structure to model the relationships between topics and words, which is conducive to the discovery of meaningful topics and topic hierarchies.
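To make the "soft partition" idea concrete, the following is a minimal sketch of a single binary latent variable with binary word-presence children, i.e. the smallest possible latent tree. It is not the authors' learning algorithm: the vocabulary, the function names (`to_binary`, `posterior_on`), and all probabilities are hypothetical and hand-picked for illustration, and the words are assumed conditionally independent given the latent variable.

```python
from math import prod

def to_binary(doc_tokens, vocab):
    """Map a tokenized document to word-presence indicators.

    Each vocabulary word becomes one binary variable (present/absent),
    so repeated tokens do not change the representation.
    """
    present = set(doc_tokens)
    return [1 if w in present else 0 for w in vocab]

def posterior_on(x, prior_on, p_word_on, p_word_off):
    """Return P(Z=1 | x) for a binary latent variable Z.

    The children of Z are binary word variables, assumed conditionally
    independent given Z (a one-level latent tree).
    """
    def likelihood(ps):
        # Product over words of P(word present) or P(word absent).
        return prod(p if xi else 1.0 - p for xi, p in zip(x, ps))

    on = prior_on * likelihood(p_word_on)
    off = (1.0 - prior_on) * likelihood(p_word_off)
    return on / (on + off)

# Hypothetical vocabulary and parameters, chosen by hand for illustration.
vocab = ["goal", "match", "team"]
prior_on = 0.3                      # P(Z=1)
p_word_on = [0.8, 0.7, 0.9]         # P(word present | Z=1)
p_word_off = [0.05, 0.1, 0.1]       # P(word present | Z=0)

docs = [["team", "goal", "goal", "match"], ["stock", "market"]]
for tokens in docs:
    x = to_binary(tokens, vocab)
    # The posterior is the document's soft membership in the Z=1 cluster.
    print(x, posterior_on(x, prior_on, p_word_on, p_word_off))
```

Each document receives a membership probability rather than a hard label, which is what lets the documents with high probability under Z=1 be read as a topic. A full HLTM would stack further binary latent variables above this one and learn both the structure and the parameters from data.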